Generative Adversarial Networks for Bitcoin Data Augmentation
In Bitcoin entity classification, results are strongly conditioned by the
ground-truth dataset, especially when applying supervised machine learning
approaches. However, these ground-truth datasets are frequently affected by
significant class imbalance as generally they contain much more information
regarding legal services (Exchange, Gambling), than regarding services that may
be related to illicit activities (Mixer, Service). Class imbalance increases
the complexity of applying machine learning techniques and reduces the quality
of classification results, especially for underrepresented, but critical
classes.
In this paper, we propose to address this problem by using Generative
Adversarial Networks (GANs) for Bitcoin data augmentation as GANs recently have
shown promising results in the domain of image classification. However, there
is no "one-fits-all" GAN solution that works for every scenario. In fact,
setting GAN training parameters is non-trivial and heavily affects the quality
of the generated synthetic data. We therefore evaluate how GAN parameters such
as the optimization function, the size of the dataset and the chosen batch size
affect GAN implementation for one underrepresented entity class (Mining Pool)
and demonstrate how a "good" GAN configuration can be obtained that achieves
high similarity between synthetically generated and real Bitcoin address data.
To the best of our knowledge, this is the first study presenting GANs as a
valid tool for generating synthetic address data for data augmentation in
Bitcoin entity classification.
Comment: 8 pages, 5 figures, 4 tables
Cascading Machine Learning to Attack Bitcoin Anonymity
Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the
most used digital assets to date. Its unregulated nature and inherent anonymity
of users have led to a dramatic increase in its use for illicit activities.
This calls for the development of novel methods capable of characterizing
different entities in the Bitcoin network. In this paper, a method to attack
Bitcoin anonymity is presented, leveraging a novel cascading machine learning
approach that requires only a few features directly extracted from Bitcoin
blockchain data. Cascading, used to enrich entity information with data from
previous classifications, led to considerably improved multi-class
classification performance with excellent values of Precision close to 1.0 for
each considered class. Final models were implemented and compared using
different machine learning models and showed significantly higher accuracy
compared to their baseline implementation. Our approach can contribute to the
development of effective tools for Bitcoin entity characterization, which may
assist in uncovering illegal activities.
Comment: 15 pages, 7 figures, 4 tables, presented in 2019 IEEE International
Conference on Blockchain (Blockchain
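The cascading step described in the abstract above (enriching entity features with outputs of a previous classification) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier, the helper names, and the toy features are not taken from the paper, which compares several machine learning models. In practice, the stage-one outputs used for training would normally be produced out-of-fold to avoid leakage.

```python
def class_centroids(rows, labels):
    """Toy stand-in classifier (hypothetical): mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in zip(rows, labels):
        total = sums.get(label, [0.0] * len(features))
        sums[label] = [t + f for t, f in zip(total, features)]
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in total]
            for label, total in sums.items()}


def distances(centroids, features):
    """Squared distance to each class centroid, in sorted label order."""
    return [sum((c - f) ** 2 for c, f in zip(centroids[label], features))
            for label in sorted(centroids)]


def enrich(rows, stage_one):
    """Cascading step: append the stage-one outputs to each feature vector."""
    return [features + distances(stage_one, features) for features in rows]


def predict(centroids, features):
    """Return the label of the closest centroid."""
    scores = distances(centroids, features)
    return sorted(centroids)[scores.index(min(scores))]


# Toy data: two features per entity, two classes.
rows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["Exchange", "Exchange", "Mixer", "Mixer"]

stage_one = class_centroids(rows, labels)        # first classification
enriched = enrich(rows, stage_one)               # features + stage-one outputs
stage_two = class_centroids(enriched, labels)    # second classification

query = enrich([[1.0, 0.1]], stage_one)[0]
print(predict(stage_two, query))
```

The second stage sees both the original features and the first stage's per-class scores, which is the sense in which information from a previous classification is carried forward.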
Visual Analytics Platform for Centralized COVID-19 Digital Contact Tracing
The COVID-19 pandemic and its dramatic worldwide impact have required global multidisciplinary actions to mitigate its effects. Mobile phone activity-based digital contact tracing (DCT) via Bluetooth Low Energy technology has been considered a powerful pandemic monitoring tool, yet it sparked a controversial debate about privacy risks for people. To explore the potential benefits of a DCT system in the context of occupational risk prevention, this article presents the potential of visual analytics methods to summarize and extract relevant information from complex DCT data collected during a long-term experiment at our research center. Visual tools were combined with quantitative metrics to provide insights into contact patterns among volunteers. Results showed that crucial actors, such as participants acting as bridges between groups, could be easily identified, ultimately allowing for more informed management decisions aimed at containing the potential spread of a disease.
This research work has been carried out within the context of the RAPID initiative, fostered by the Basque Government as part of the fast reaction program (PRAP Euskadi, led by SPRI, the entity of the Economic Development, Sustainability, and Environment Department of the Basque Government for promoting the Basque industry) with the aim to boost the Basque industrial sector by maintaining the productive activity in the context of the threat of the COVID-19 pandemic. Three research centers of BRTA (Basque Research and Technology Alliance) have collaborated in this R&D initiative: Tecnalia, Ikerlan, and Vicomtech. Among the different research lines carried out in the RAPID initiative, Vicomtech has been responsible for the centralized BLE-based DCT system and visual analytics of the obtained data which has been selected as one of the representative cases by the OECD of pandemic reaction report
Temporal graph-based approach for behavioural entity classification
Graph-based analyses have gained a lot of relevance in the past years due to their high potential in describing complex systems by detailing the actors involved, their relations and their behaviours. Nevertheless, in scenarios where these aspects are evolving over time, it is not easy to extract valuable information or to characterize correctly all the actors.
In this study, a two-phased approach for exploiting the potential of graph structures in the cybersecurity domain is presented. The main idea is to convert a network classification problem into a graph-based behavioural one. We extract these graph structures that can represent the evolution of both normal and attack entities and apply a temporal dissection approach in order to highlight their micro-dynamics. Further, three clustering techniques are applied to the normal entities in order to aggregate similar behaviours, mitigate the imbalance problem and reduce noisy data. Our approach suggests the implementation of two promising deep learning paradigms for entity classification based on Graph Convolutional Networks.
Bitcoin and cybersecurity: temporal dissection of blockchain data to unveil changes in entity behavioral patterns
The Bitcoin network is not only vulnerable to cyber-attacks but also currently represents the most frequently used cryptocurrency for concealing illicit activities. Typically, Bitcoin activity is monitored by decreasing anonymity of its entities using machine learning-based techniques, which consider the whole blockchain. This entails two issues: first, it increases the complexity of the analysis, requiring higher efforts, and, second, it may hide network micro-dynamics important for detecting short-term changes in entity behavioral patterns. The aim of this paper is to address both issues by performing a 'temporal dissection' of the Bitcoin blockchain, i.e., dividing it into smaller temporal batches to achieve entity classification. The idea is that a machine learning model trained on a certain time-interval (batch) should achieve good classification performance when tested on another batch if entity behavioral patterns are similar. We apply cascading machine learning principles (a type of ensemble learning applying stacking techniques), introducing a 'k-fold cross-testing' concept across batches of varying size. Results show that blockchain batch size used for entity classification could be reduced for certain classes (Exchange, Gambling, and eWallet) as classification rates did not vary significantly with batch size, suggesting that behavioral patterns did not change significantly over time. Mixer and Market class detection, however, can be negatively affected. A deeper analysis of Mining Pool behavior showed that models trained on recent data perform better than models trained on older data, suggesting that 'typical' Mining Pool behavior may be represented better by recent data. This work provides a first step towards uncovering entity behavioral changes via temporal dissection of blockchain data.
This work was partially funded by the European Commission through the Horizon 2020 research and innovation program, as part of the 'TITANIUM' project (Grant Agreement No. 740558).
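The temporal dissection and cross-testing idea in the abstract above can be sketched as follows. This is one plausible reading only, under stated assumptions: the record layout `(timestamp, features, label)`, the fixed `batch_width`, the function names, and the nearest-centroid classifier are all hypothetical stand-ins, not the paper's cascading models or its exact 'k-fold cross-testing' protocol.

```python
from collections import defaultdict


def temporal_batches(records, batch_width):
    """Group (timestamp, features, label) records into consecutive time windows."""
    batches = defaultdict(list)
    for timestamp, features, label in records:
        batches[timestamp // batch_width].append((features, label))
    return [batches[key] for key in sorted(batches)]


def fit_centroids(batch):
    """Toy stand-in classifier (hypothetical): mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for features, label in batch:
        total = sums.get(label, [0.0] * len(features))
        sums[label] = [t + f for t, f in zip(total, features)]
        counts[label] += 1
    return {label: [t / counts[label] for t in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=sq_dist)


def cross_test(batches):
    """Train on each batch and test on every other batch.

    Returns accuracy keyed by (train_batch_index, test_batch_index).
    """
    scores = {}
    for i, train in enumerate(batches):
        model = fit_centroids(train)
        for j, test in enumerate(batches):
            if i == j:
                continue
            hits = sum(predict(model, f) == y for f, y in test)
            scores[(i, j)] = hits / len(test)
    return scores


# Toy records: timestamps in arbitrary units, two features per entity.
records = [
    (0, [1.0, 0.0], "Exchange"), (5, [0.0, 1.0], "Mixer"),
    (12, [0.9, 0.1], "Exchange"), (18, [0.1, 0.9], "Mixer"),
]
batches = temporal_batches(records, batch_width=10)
scores = cross_test(batches)
```

If accuracy stays high for a pair of batches, behavior in the two time windows is similar under this toy model; a drop between an older training batch and a recent test batch would hint at the kind of behavioral change the abstract describes.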